grep -l @rtx 3090 /news/*.json | wc -l → 2

@RTX 3090

mentions: 2, type: Person
03:39
2026-05-23
dev.to
large-language-models

BeeLlama v0.2.0: 164 tok/s on a 27B model, one RTX 3090

BeeLlama v0.2.0 demonstrates that speculative decoding can achieve a 4.4x to 4.93x throughput multiplier on a single RTX 3090, running 27B and 31B parameter models from a baseline of 36 to 37 tokens per second …
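As a quick sanity check on the figures quoted above, the throughput multiplier is the accelerated rate divided by the baseline rate. The sketch below is illustrative only: the `speedup` helper is a hypothetical name, and pairing the headline's 164 tok/s with the 37 tok/s baseline is an assumption, since the truncated summary does not say which baseline belongs to which model.

```python
def speedup(accelerated_tps: float, baseline_tps: float) -> float:
    """Return the throughput multiplier: accelerated tok/s over baseline tok/s."""
    if baseline_tps <= 0:
        raise ValueError("baseline throughput must be positive")
    return accelerated_tps / baseline_tps


# Figures taken from the headline and summary. Assumption: the 164 tok/s
# result and the 37 tok/s baseline refer to the same 27B model.
multiplier = speedup(164.0, 37.0)
print(f"{multiplier:.2f}x")  # prints "4.43x"
```

Under that assumption the result lands at the low end of the quoted 4.4x to 4.93x range, which is consistent with the summary.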

// co-occurs with top 8 entities